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Nonlinear  independent  component  analysis  is  combined  with 
diffusion-map  data  analysis  techniques  to  detect  good  observ¬ 
ables  in  high-dimensional  dynamic  data.  These  detections  are 
achieved  by  integrating  local  principal  component  analysis  of  sim¬ 
ulation  bursts  by  using  eigenvectors  of  a  Markov  matrix  describing 
anisotropic  diffusion.  The  widely  applicable  procedure,  a  crucial 
step  in  model  reduction  approaches,  is  illustrated  on  stochastic 
chemical  reaction  network  simulations. 

slow  manifold  |  dimensionality  reduction  |  chemical  reactions 

Evolution  of  dynamical  systems  often  occurs  on  two  or  more 
time  scales.  A  simple  deterministic  example  is  given  by  the 
coupled  system  of  ordinary  differential  equations  (ODEs) 

du/dt  =  a(u,v),  [1] 

dv/dt  =  x~1^(u,v),  [2] 

with  the  small  parameter  0  <  t  1,  where  a(u,  v)  and  |3(w,  v)  are 
0(1).  For  any  given  initial  condition  (uq,vq),  already  at  t  =  O(t) 
the  system  approaches  a  new  value  ( uo,v ),  where  v  satisfies  the 
asymptotic  relation  $(uo,v)  =  0.  Although  the  system  is  fully 
described  by  two  coordinates,  the  relation  |3(t/,v)  =  0  defines 
a  slow  one-dimensional  manifold  which  approximates  the  slow 
dynamics  for  t  ^  x.  In  this  example,  it  is  clear  that  v  is  the  fast 
variable  whereas  u  is  the  slow  one.  Projecting  onto  the  slow  man¬ 
ifold  here  is  rather  easy:  The  fast  foliation  is  simply  “vertical”, 
i.e.  u  =  const.  However,  when  we  observe  the  system  in  terms  of 
the  variables  x  =  x(u,v)  and  y  =  y(u,v )  which  are  unknown  non¬ 
linear  functions  of  u  and  v,  then  the  “observables”  x  and  y  have 
both  fast  and  slow  dynamics.  Projecting  onto  the  slow  manifold 
becomes  nontrivial,  because  the  transformation  from  (x,y)  to  (u,  v) 
is  unknown.  Detecting  the  existence  of  an  intrinsic  slow  manifold 
under  these  conditions  and  projecting  onto  it  are  important  in  any 
model  reduction  technique.  Knowledge  of  a  good  parametrization 
of  such  a  slow  manifold  is  a  crucial  component  of  the  equation-free 
framework  for  modeling  and  computation  of  complex/multiscale 
systems  (1-3). 

Principal  component  analysis  (PCA,  also  known  as  POD)  (4-6) 
has  traditionally  been  used  for  data  and  model  reduction  in  con¬ 
texts  ranging  from  meteorology  (7)  and  transitional  flows  (8)  to 
protein  folding  (9,  10);  in  these  contexts  the  PCA  procedure  is 
used  to  detect  good  global  reduced  coordinates  that  best  capture 
the  data  variability.  In  recent  years,  diffusion  maps  (11-17)  have 
been  used  in  a  similar  spirit  to  detect  low-dimensional,  nonlinear 
manifolds  underlying  high-dimensional  datasets. 

In  this  paper,  we  integrate  ensembles  of  local  PCA  analyses 
in  the  diffusion-map  framework  to  enable  the  detection  of  slow 
variables  in  high-dimensional  data  arising  from  dynamic  model 
simulations.  The  proposed  algorithm  is  built  along  the  lines  of  the 
nonlinear  independent  component  analysis  method  recently  intro¬ 
duced  in  ref.  18.  The  approach  takes  into  account  the  time  depen¬ 
dence  of  the  data,  whereas  in  the  diffusion-map  approach  the 
time  labeling  of  the  data  points  is  not  included.  We  demonstrate 


our  algorithm  for  stochastic  simulators  arising  in  the  context  of 
chemical/biochemical  reaction  modeling. 

Multiscale  Chemical  Reactions:  A  Toy  Example 

Consider  the  reversible  chemical  reaction  [a  dimerization,  which 
is  a  part  of  several  biochemical  mechanisms  (19, 20)]  involving  two 
molecular  species  X  and  Y, 

k\ 

x+x^y,  [3] 

k2 

where  k\  and  k2  are  the  forward  and  backward  rate  constants. 
The  probability  that  an  additional  molecule  of  type  Y  is  produced 
from  twoX  molecules  (respectively,  two  molecules  ofX  produced 
from  one  molecule  of  Y)  in  an  infinitesimally  small  time  interval 
[t,t+dt]  is  k\X(t)(X(t)  —  1  )dt  (respectively,  k2Y(i)dt),  where X(t) 
and  Y (t)  are  the  number  of  molecules  of  type  X  and  Y  at  time  t 
(21).  The  chemical  reaction  in  Eq.  3  satisfies  the  stoichiometric 
conservation  law 

X(t)  +  2Y(0  =  const,  [4] 

so  that  the  state  vector  [X(t),Y(t)]  is  restricted  to  a  line  in  the 
phase  plane.  We  now  couple  the  chemical  reaction  in  Eq.  3  with  a 
slow  production  of  X  molecules  from  an  external  source 

k3 

0  X,  [5] 

where  in  Eq.  5  means  that  the  probability  of  the  external  produc¬ 
tion  of  an  additional  molecule  of  typeX  in  an  infinitesimally  small 
time  interval  [ t,  t  +  dt]  is  k3dt;  the  rate  constants  and  the  initial 
state  are  chosen  in  such  a  way  that  the  production  process  in  Eq. 
5  is  much  slower  than  the  dimerization  reactions  in  Eq.  3.  This  is 
the  case,  for  example,  for  the  following  choice  of  parameters: 

X(0)  =  100,  Y(0)  =  100,  k\  =  1,  k2  =  100,  k3  =  50.  [6] 

The  average  time  to  produce  an  additional  X  molecule  is  k^1  = 
0.02,  whereas  the  average  times  for  the  forward  and  backward 
dimerization  are  (&iX(0)(X(0)  -  l))-1  «  1(T4  and  (k2Y( 0))”1  = 
10-4.  This  finding  implies  that  bothX  and  Y  are  fast  variables;  yet 
their  linear  combination  X  +  2Y  is  a  slow  variable.  The  conser¬ 
vation  law  in  Eq.  4  no  longer  holds  since  production  was  added. 
Instead,X+2Y  is  slowly  growing.  To  confirm  this  fact,  we  simulate 
the  time  evolution  of  the  pair  [X(t),Y(t)]  by  using  the  Gillespie 
stochastic  simulation  algorithm  (SSA)  (21).  In  Fig.  1,  we  plot  the 
time  evolution  ofX,  Y  andX  +  2Y. 

This  finding  naturally  leads  to  the  following  question:  How  does 
one  detect  the  slow  variableX+2Y  from  data?  A  priori  knowledge 
that  we  seek  a  linear  combination  of  the  original  variables  lends 
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Fig.  1. 


chemical  system  in  Eqs.  3  and  5. 
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Toy  example.  The  time  evolution  of  X,  Y  and  X  +  2Y  given  by  the  stochastic  simulation  of  the 


itself  to  fitting  the  coefficients  of  such  a  combination.  Such  fitting 
is,  however,  not  possible  for  the  general  nonlinear  case. 

Short  Simulation  Bursts 

It  is  convenient  to  analyze  our  approach  in  the  diffusion  limit,  for 
which  the  simulation  is  well  approximated  by  a  stochastic  differ¬ 
ential  equation  (SDE).  The  chemical  Langevin  equation  for  the 
time  evolution  of  X  and  Y,  which  is  formally  derived  from  the 
corresponding  chemical  master  equation,  is  given  in  the  Ito  form 
by  refs.  22-24 

dr  =  ( 2k3y  —  2k\x(x  —  1)  +  k3)dt 

—  2-yJk\x(x  -  1)  dwi  +  2 y/kyy  dw2  +  \fk3  dw3,  [7] 
dy  =  (k\x(x  —  1)  —  k^y)  d t 

+  yjkix(x  -  1)  dwi  -  y[k?y  d w2,  [8] 

where  W[  (i  =  1, 2, 3)  are  standard  independent  Brownian  motions. 
The  approximation  in  Eqs.  7  and  8  is  also  characterized  by  a  time 
scale  separation  and  possesses  the  slow  variable  x+2y;  multiplying 
Eq.  8  by  two  and  adding  it  to  Eq.  7  gives 

d(x  +  2y)  =  £3  dt  +  y^3  dw3.  [9] 

Eq.  9  shows  that  the  approximated  stochastic  dynamics  of  x  +  2y 
are  decoupled  from  the  individual  dynamics  of*  andy,  as  expected 
from  Eqs.  3  and  4. 

The  Euler-Maruyama  method  for  Eqs.  7  and  8  suggests  that  in  a 
time  step  At,  the  state  vector  [x(t),y(t)]  propagates  to  the  random 
state  vector  [x(t  +  A t),y(t  +  At)] 

x(t  +  At)  «  x(t)  +  (2kyy{t)  —  2kix(t)(x(t)  —  1)  +k3)  At 

-  2-y/ ( lcpc(t)(x(t )  -  1)  +  k-2y(t))  Z|  +  v^3-Z2, 
y(t  +  At)  « y(t )  +  {k\x(t)(x(t)  -  1)  -  k2y(t))  At 
+  J (kix(t)(x(t)  -  1 )+%(/)) Zi, 

where  Zi,Z2  ~  A/"(0,  At)  are  independent,  normally  distributed 
random  variables  with  zero  mean  and  variance  At  (Z\  and  Z2  cor¬ 
respond  to  the  dwi  and  dw2  terms,  respectively,  in  Eqs.  7  and  8), 
which  means  that  if  we  were  to  run  many  simulations  for  a  short 
time  step  At,  all  starting  at  [x(t),y(t)\,  the  trajectories  would  end 
up  at  random  locations  forming  a  “point”  cloud  in  the  phase  plane. 
The  point  cloud  has  a  bivariate  normal  distribution,  whose  center 
is  located  at  /r  =  [/xx,/xy]T,  given  by 

fix  =  x(t)  +  (2 k^y(t)  -  2kix(t)(x(t)  -  1)  +  k3)  At, 
fiy  =  y(t)  +  ( kxx(t)(x(t )  -  1)  -  %(0)  At> 


and  whose  two-by-two  covariance  matrix  E  is 

E  =  BBr, 

where 

b  =  Va7  (  -2)g)  ES±  Yh  \ 

V  Jk\x(t)(x(t)  -  1)  +  k?y(t)  0  )  ' 

The  shape  of  the  point  cloud  is  an  ellipse  because  the  level  lines 
of  the  probability  density  function 

= yyim' "p  \J^X -■ -■ 

are  ellipses  (x  =  [x,y]T).  When  there  is  a  separation  of  time  scales, 
the  ellipses  are  thin  and  elongated.  For  example,  for  the  set  of  para¬ 
meters  given  in  Eq.  6,  the  eigenvalues  of  E  for  [x,y]  =  [100, 100] 
are  o\  ~  105  At  and  ~  10  A^.  These  approximations  mean  that 
the  long  axis  of  the  ellipse  is  two  orders  of  magnitude  longer 
than  the  short  axis  (ai/a2  ~  102).  The  eigenvector  correspond¬ 
ing  to  ai  is  approximately  [—2, 1]T,  pointing  in  the  direction  of  the 
fast  dynamics  on  the  line  x  +  2 y  =  const.  The  second  eigenvec¬ 
tor  is  approximately  [l,2]r,  pointing  in  the  direction  of  the  slow 
dynamics. 

The  eigen-decomposition  of  the  covariance  matrix  is  simply 
the  PCA  of  the  local  point  cloud  generated  by  the  short  simu¬ 
lation  burst.  We  produce  many  short  simulation  bursts  starting 
at  different  initialization  points  [x,y].  For  each  burst,  we  perform 
the  PCA  and  estimate  its  covariance  matrix  2^).  The  principal 
components  of  2^)  are  the  local  directions  of  the  rapidly  chang¬ 
ing  variables  at  [x,y],  whereas  components  with  small  eigenvalues 
correspond  to  the  slow  variables. 

We  wish  to  piece  together  the  locally  defined  components  into 
globally  consistent  coordinates.  The  toy  model  in  Eqs.  3-5  presents 
no  special  difficulty  because  the  principal  components  of  2^) 
are  approximately  [—2, 1]  and  [1,2]  everywhere  (independent  of 
[x,y]).  In  general,  however,  the  slow  variable  may  be  some  com¬ 
plicated  nonlinear  function  of  the  state  variables.  In  such  cases,  it 
is  not  trivial  to  find  a  globally  consistent  slow  coordinate. 

Anisotropic  Diffusion  Maps 

To  integrate  the  local  information  into  global  coordinates,  we  use 
anisotropic  diffusion  maps  (ADM),  introduced  in  ref.  18.  Sup¬ 
pose  u  =  u(x,y)  =  x +  2 y  (respectively,  v  =  v(x,y)  =  —2x  +y)  are 
the  slowly  changing  (respectively,  the  rapidly  changing  variables). 
Together,  they  define  a  mapg  :  (x,y)  ( u ,  v)  from  the  observable 
state  variables*  andy  to  the  “dynamically  meaningful”  coordinates 
u  and  v.  Alternatively,  the  inverse  map  /  =  g~x  :  ( u,v )  (x,y) 

is  given  by*  =  x(u,v)  and  y  =  y(u,v).  The  point  cloud  in  the 
observable  (x,y)  plane,  generated  by  the  short  bursts,  is  the  image 
under  /  of  a  similar  point  cloud  in  the  inaccessible  ( u ,  v)  plane. 
The  slow  manifold  (curve)  in  the  (x,y)  plane  can  be  thought  of 
as  the  image  of  the  u  axis,/(w,0)  =  [x(u,0),y(u,  0)].  The  ellipses 
in  the  ( u,v )  plane  are  also  thin  and  elongated,  and  they  share  an 
important  property:  They  all  have  the  v  axis  as  their  long  axis  and 
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the  u  axis  as  their  short  axis,  due  to  the  separation  of  time  scales. 
The  ratio  between  the  eigenvalues  of  E  defines  a  small  parameter 
0  <  x2  «  1  that  measures  the  time  scale  separation.  In  other 
words,  the  change  in  u  in  a  small  time  step  At  is  typically  x  times 
smaller  than  the  amount  of  change  in  v.  The  parameter  x  =  t(u) 
can  also  be  a  function  of  u ,  allowing  the  possibility  of  different 
variability  of  the  rapid  dynamics  for  different  values  of  u.  This 
possibility  suggests  the  need  to  define  the  scaled  variable  vT  =  xv. 
This  scaling  contracts  the  elongated  ellipse  in  the  ( u ,  v)  plane  into 
a  circle  in  the  ( u,vx )  plane. 

Now  that  we  have  shown  how  to  identify  ellipses  in  the  observ¬ 
able  (x,y)  space  that  are  images  of  circular  disks  in  the  inaccessible 
(u,  vx)  space,  we  are  in  position  to  use  the  result  of  ref.  18,  which 
relates  the  anisotropic  graph  Laplacian  in  the  observable  space 
with  the  (isotropic)  graph  Laplacian  in  the  inaccessible  space.  We 
formulate  our  method  in  a  general  setting.  Then  we  apply  it  to  the 
toy  example. 

The  construction  of  the  ADM  is  performed  as  follows.  Suppose 
x®  e  Rm,  i  =  1, . . . , N,  are  N  data  points  in  an  M-dimensional 
data  space.  For  every  data  point  x®  =  [x\\  x®, . . . ,  x^],  i  = 
1, . . . ,  N,  we  generate  an  ensemble  of  short  simulation  bursts  ini¬ 
tialized  at  the  data  point,  i.e.  x(0)  =  [xfyO),  X2(0), . . . ,  xm{ 0)]  = 
x®.  We  collect  the  statistics  of  the  simulated  trajectories  after 
a  short  time  period  At.  In  particular,  we  compute  the  averaged 

position  ti(i)  =  [fif, 

uf  =(xj(At)\x(0)=xU),  j  =1,...  ,M,  [10] 

and  the  elements  of  the  covariance  matrix 


by 

°jk  =  ^ [{xj(&t)xk(At)  | x(0)  =  x®)  -  iif  tif],  [11] 

where  the  notation  (•)  stands  for  statistical  averaging  over  many 

simulated  trajectories.  For  each  data  point we  calculate  E^  , 
the  inverse  of  the  covariance  matrix.  We  define  a  symmetric 
E-dependent  squared  distance  between  pairs  of  data  points  in 
the  observable  space  RM 

(%(x®,x®) 

=  t(x®  -x«)r  ((V'fy1  +  (V0)”1)  (*(i)  -xW).  [12] 

Note  that  for  the  toy  model  in  Eqs.  3-5  the  distance  d x  is  a  second 
order  approximation  of  the  Euclidean  distance  in  the  inaccessible 
(u,vx)~  space 

d\(x®,x®)  ~  (u®  -  u^)2  +  x2(v^  -  v^)2.  [13] 


on  the  slow  manifold  is  close)  have  Wy  close  to  1,  whereas  distant 
points  have  Wy  close  to  0.  Next,  we  define  a  diagonal  N  x  N  nor¬ 
malization  matrix  D  whose  values  are  given  by  the  row  sums  of  W 

N 

Dii  =  YJWik. 

k=  1 

We  then  compute  the  eigenvalues  and  right  eigenvectors  of  the 
row  stochastic  matrix 

A  =  D  *W,  [16] 

which  can  be  viewed  as  a  Markov  transition  probability  matrix 
for  a  jump  process  over  the  data  points  {x^}^Lv  The  discrete 
jump  process  converges  in  the  limit  of  N  oo  and  8  0  to 

a  continuous  diffusion  process  over  the  observable  data  manifold. 
The  diffusion  process  is  anisotropic  due  to  the  metric  d s,  so  that 
the  diffusion  coefficient  changes  with  direction.  Therefore,  the 
eigenvectors  of  A  are  discrete  approximations  of  the  continuous 
eigenfunctions  of  the  anisotropic  differential  diffusion  genera¬ 
tor  over  the  observable  manifold.  The  approximation  in  Eq.  14 
implies  that  the  long  time  behavior  ( t  x)  of  the  anisotropic  dif¬ 
fusion  process  over  the  observable  manifold  can  be  approximated 
to  leading  order  in  x  as  an  isotropic  diffusion  process  over  the 
slow  u  manifold.  Equivalence  of  the  long  time  behavior  suggests 
that  the  low-frequency  eigenfunctions  of  the  two  diffusion  gener¬ 
ators  are  approximately  equal.  It  follows  that  the  eigenvectors  of 
A  approximate  the  eigenfunctions  of  isotropic  diffusion  generator 
(the  Laplacian  or  the  backward  Fokker-Planck  operator)  over  the 
slow  u  manifold.  These  eigenfunctions  are  functions  of  the  slow  ( u ) 
variables  that  do  not  depend  on  the  fast  (v)  variables.  Hence,  the 
low  order  eigenvectors  of  A  give  an  approximate  parametrization 
of  the  slow  manifold. 

As  discussed  in  refs.  12  and  25-27,  the  leading  eigenvectors 
may  be  used  as  a  basis  for  a  low-dimensional  representation  of 
the  data.  To  compute  those  eigenvectors,  we  use  the  fact  that 
A  =  D~1/2SD1/2  where  S  =  D-1/2WD-1/2  is  a  symmetric  matrix. 
Hence,  A  and  S  are  similar  and  thus  have  the  same  spectrum. 
Because  S  is  symmetric,  it  has  a  complete  set  of  eigenvectors  q7, 
j  =  0, . . . ,  N  —  1,  with  corresponding  eigenvalues 

>  M  >  •  •  •  >  h'N-i-  [17] 

The  right  eigenvectors  of  A  are  given  by 

u;  =  D^q  j.  [18] 

Because  A  is  a  Markov  matrix,  all  its  eigenvalues  are  <1,  with 
largest  eigenvalue  Xq  =  1  and  a  corresponding  trivial  eigenvector 
uo  =  [1, 1, . . . ,  1].  We  define  the  low^-dimensional  representation 
of  the  state  vectors  by  the  following  ADM 

%  :  x(l)  ->  [uf,  uf, uf\,  [19] 


Because  x  is  a  small  parameter,  d 2  is  controlled  by  the  difference 
in  the  slow  coordinate.  The  approximation  in  Eq.  13  is  also  valid 
in  higher  dimensions,  where  there  may  be  more  than  one  slow 
coordinate  ( u )  and  several  fast  coordinates  (v)  and  the  ellipse  is 
replaced  by  an  ellipsoid.  In  such  cases, 

«  ||«®  -  ||2  + 12  ||v(l)  -  v®  ||2.  [14] 

Therefore,  the  ADM  based  on  the  “dynamic  proximity”  ds  approx¬ 
imates  the  Laplacian  on  the  slow  manifold.  We  construct  an  N  x  N 
weight  matrix  W 


Wy  =  exp 


(  d\{x^l\x^) 

i  ^ 


[15] 


where  e  >  0  is  the  single  parameter  of  the  method.  The  elements 
of  the  matrix  W  are  all  <1.  Nearby  points  (i.e.,  their  projection 


that  is,  the  point  x®  is  mapped  to  a  vector  containing  the  zth  coor¬ 
dinate  of  each  of  the  first  n  leading  eigenvectors  of  the  matrix 

A.  The  variables  u®,  u®, . . . ,  (which  are  defined  on  the  data 
points)  are  the  candidate  slow  variables  that  we  were  looking  for. 

Application  of  ADM  to  the  Toy  Example 

We  use  N  =  2000  data  points  xu>  =  [xf,xf]  =  [X®,Y®], 
i  =  1, . . .  ,2000,  uniformly  sampled  from  the  illustrative  trajec¬ 
tory  of  Fig.  1  (in  fact,  the  trajectory  in  Fig.  1  is  visualized  using 
these  2000  data  points).  For  every  data  point  x®  = 
i  =  1, . . . ,  2000,  we  run  107  replicas  of  stochastic  simulations  ini¬ 
tialized  at  the  data  point  for  time  At  =  10~4.  We  estimate  [if*  and 
crjk  ,  i  =  1, . . . ,  2000,;  =  1, 2,  k  =  1, 2  by  Eqs.  10  and  11  as  an  aver¬ 
age  over  107  realizations.  For  each  data  point  x®  =  [X^l\Y^], 
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Fig.  2.  Toy  example.  (Left)  The  dataset  with  each  point  colored  according  to  Uj.  (Center)  Vector  u^  as  a  function  of  X  +  2 Y.  (Right)  Vector  ui  as  a  function 
of  X. 


we  also  calculate  the  inverse  covariance  matrix  and  the  symmetric 
E-dependent  squared  distance  d\  (x® ,  x® )  by  Eq.  12.  We  construct 
a  2000  x  2000  weight  matrix  W  by  Eq.  15  for  e  =  0.1  and  a  matrix 
A  by  Eq.  16.  We  compute  the  leading  eigenvectors  uj  of  A  by  Eq. 
18.  In  Fig.  2,  we  plot  our  dataset  where  the  points  are  colored 
according  to  the  first  nontrivial  eigenvector  m.  We  see  that  the 
eigenvector  ui  gives  a  good  description  of  slow  dynamics  of  this 
system.  The  slow  dynamics  are  given  by  function  X  +  2Y  as  can 
be  seen  in  the  Right  frame  of  Fig.  1.  The  plot  of  ui  vs.  X  +  2Y  is 
shown  in  Center  frame  of  Fig.  2.  We  again  confirm  that  we  obtained 
a  good  slow  description  of  the  system.  Finally,  plotting  the  eigen¬ 
vector  ui  vs.  X  confirms  thatX  is  not  a  good  slow  variable  {Right 
frame  of  Fig.  2). 

Here  we  used  a  simulation  burst  of  107  trajectories.  The  num¬ 
ber  of  simulation  “bursts”  needed  to  construct  a  distance  metric 
based  on  the  covariance  in  a  high-dimensional  system  depends 
on  the  dimensionality  and  the  desired  degree  of  approximation. 
The  central  limit  theorem  suggests  that  the  estimated  covariance 
matrix  entries  converge  with  the  square-root  number  of  simu¬ 
lated  trajectories.  However,  the  convergence  of  the  eigenvalues 
and  eigenvectors  (principal  components)  of  the  covariance  matrix 
depends  on  the  dimensionality  M  (see,  e.g.  ref.  28)  as  crossings  of 
eigenvalues  may  occur. 

Oscillating  "Half-Moons" 

Next,  consider  the  system  of  stochastic  differential  equations 

d u  =  0i  dt  +  d2  dwi,  [20] 

dv  =  <33(1  —  v)  dt  +  04  dw2,  [21] 

where  at,  i  =  1,2, 3, 4,  are  constants  and  Wi,  i  =  1,2  are 
independent  8 -correlated  white  noises  (Wiener  processes).  We 
consider  Eqs.  20  and  21  together  with  the  following  nonlinear 
transformation  of  variables 

x  =  vcos(u  +  v  —  1),  y  =  v  sin(u  +  v  —  1).  [22] 


We  will  assume  that  the  observables  x  andy  are  the  actual  observ¬ 
ables,  whereas  u  and  v  are  unknown.  We  choose  the  values  of 
parameters  as  a\  =  02  =  10-3,  03  =  04  =  10_1.  The  illustrative 
trajectory  that  starts  at  [x(0),y(0)]  =  [1, 0]  is  plotted  in  the  Left 
frame  of  Fig.  3.  The  trajectory  is  colored  according  to  time.  We 
run  simulations  for  a  longer  time  8  x  104,  which  accounts  for  about 
12-13  rotations,  and  record  2000  data  points  at  equidistant  time 
intervals  of  length  8  x  104/2000  =  40.  This  dataset  is  plotted  in 
the  Center  frame  of  Fig.  3.  Again,  points  are  colored  according  to 
time.  We  clearly  see  that  there  is  no  correlation  between  time  and 
the  slow  variable  (which  is  u  MOD  2n)  because  of  oscillations. 

To  apply  the  ADM,  we  run  106  replicas  of  stochastic  simula¬ 
tions  initialized  at  each  data  point  x®  =  [x^,y^]  for  a  time  step 
At  =  0.1  and  estimate  fip  and  0®,  i  =  1,  ...,2000,  j  =  1,2, 
k  =  1, 2  by  Eqs.  10  and  11  as  an  average  over  106  realizations. 
For  each  data  point  =  [x^l\y^],  we  also  calculate  the  inverse 
covariance  matrix  and  the  symmetric  E-dependent  squared  dis¬ 
tance  d\{x^l\x^)  by  Eq.  12.  Next,  we  have  to  choose  the  value  of 
parameter  e.  To  do  that,  we  construct  the  e-dependent  2000  x  2000 
weight  matrix  W  =  W(e)  by  Eq.  15  for  several  values  of  e.  Then 
we  compute 

N  N 

l(8)  =  EEw  ™ 

i=  1  7=1 

The  function  L(e)  is  plotted  in  the  Right  frame  of  Fig.  3  (it  is  a 
log-log  plot)  (29).  It  clearly  has  two  constant  asymptotes  when 
e  ->►  0  and  e  ->  00;  as  we  expect,  these  asymptotes  are  smoothly 
connected,  by  an  approximately  straight  line  of  slope  d  in  a  log- 
log  plot,  where  d  is  the  dimension  of  the  slow  manifold.  Thus,  the 
log-log  plot  of  L{ e)  suggests  to  choose  e  where  the  log-log  graph 
of  L(e)  appears  linear.  We  choose  8  =  6.  We  form  A  (by  Eq.  16) 
and  compute  its  few  leading  eigenvectors  u j  by  Eq.  18.  The  first 
nontrivial  eigenvector  ui  then  describes  the  slow  dynamics  of  the 


Fig.  3.  Oscillating  half  moons.  The  short  illustrative  trajectory  of  Eqs.  20-22  which  starts  at  [x(0),y(0)]  =  [1,0].  (Left)  The  trajectory  is  colored  according  to 
time.  The  representative  dataset  sampled  at  equal  time  steps  from  a  longer  stochastic  simulation.  (Center)  The  points  are  colored  according  to  time.  (Right) 
Plot  of  L( s)  given  by  Eq.  23. 
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Fig.  4.  Oscillating  half  moons.  (Left)  The  dataset  with  each  point  colored  according  to  u^ .  (Center)  Vector  ui  as  a  function  of  x.  (Right)  Vector  u^  as  a  function 
of  u  MOD  2u. 


system.  The  dataset  (colored  by  the  values  of  ui )  is  plotted  in  Fig.  4 
( Left  frame).  We  see  that  the  ADM  provides  a  good  description  of 
the  slow  dynamics.  Plotting  ui  against  the  observable  x  confirms 
that  the  latter  is  not  a  good  observable  ( Center  frame  of  Fig.  4). 
The  slow  variable  is  given  as  a  nonlinear  transformation  ofx  andy 
which  can  be  computed  by  inverting  Eq.  22  locally.  It  is  basically  a 
function  of  u  MOD  lit.  The  eigenvector  ui  is  plotted  against  the 
slow  variable  u  MOD  2tt  in  the  Right  frame  of  Fig.  4.  We  again 
confirm  that  we  recovered  the  slow  dynamics  correctly. 

Inherently  Nonlinear  Chemical  Reactions 

We  consider  the  following  set  of  chemical  reactions 

x  A  x  +  z,  y  +  z  -L  y,  [24] 

0  A  y,  y  A  0,  [25] 

0  ■%  X.  [26] 

The  first  two  reactions  in  Eq.  24  are  production  and  degradation 
of  Z  (catalyzed  by  X  and  Y,  respectively).  The  production  and 
degradation  of  Z  is  assumed  to  be  happening  on  a  fast  time  scale. 
The  reactions  in  Eq.  25  are  production  and  degradation  of  Y.  They 
are  assumed  to  occur  on  an  intermediate  time  scale  (i.e.  slower 


than  the  fast  time  scale,  but  faster  than  the  slow  time  scale).  The 
reaction  in  Eq.  26  is  production  of  X,  which  is  assumed  to  be  slow. 
We  choose  the  values  of  the  rate  constants  as 

k\  =  1000,  k2  =  1,  k3  =  40,  k4  =  1,  k5  =  1.  [27] 

This  choice  of  rate  constants  guarantees  that  the  reactions  in 
Eq.  24  are  the  fastest,  the  reactions  in  Eq.  25  happen  on  a  slower 
time  scale,  and  the  reaction  in  Eq.  26  is  the  slowest.  The  model  in 
Eqs.  24-26  is  approximated  by  the  ODE  system  for  the  0(1)  vari¬ 
ables  x  =  X/100,y  =  Y/40  andz  =  Z/ 2500  as  follows:  dr/d t  = 
&5/100,  dy/dt  =  £3/40  -  kyy ,  dz/dt  =  100/:ix/2500  -  A0k2yz.  By 
using  the  parameter  values  in  Eq.  27,  we  obtain  dx/dt  =  x/100, 
dy/dt  =  1  —  y,  dz/dt  =  40(x  —yz).  The  quasiequilibrium  approx¬ 
imation  in  the  z  equation  (fastest)  is  z  =  x/y,  which  gives  rise 
to  the  “half-moon  shaped”  profile  (hyperbola  +  noise)  dynam¬ 
ics  in  the  Y-Z  plane.  The  variable  y  changes  on  a  faster  time 
scale  than  x.  Roughly  speaking,  the  fluctuations  in  y  lead  to  the 
dynamics  in  z  according  to  the  formula  z  =  x/y,  where  x  changes 
very  slowly,  as  illustrated  in  Fig.  5.  We  initialize  the  system  at 
[X(0),Y(0),Z(0)]  =  [100,40,2500]  and  simulate  the  time  evo¬ 
lution  using  the  Gillespie  stochastic  simulation  algorithm.  Fig.  5 
shows  the  time  evolution  ofX  (Upper  Left  frame),  Y  (Upper  Center 
frame),  and  Z  (Upper  Right  frame).  The  same  trajectory  plotted 
in  the  Y-Z  plane  is  given  in  the  Lower  frames  of  Fig.  5.  We  plot 
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Fig.  5.  Inherently  nonlinear  chemical  reactions.  (Upper)  The  time  evolution  of  X  (Upper  Left),  Y  (Upper  Center)  and  Z  (Upper  Right)  given  by  the  stochastic 
simulation  of  the  chemical  system  in  Eqs.  24-26.  The  same  trajectory  (2000  data  points,  saved  at  equal  time  intervals  At  =  0.05  apart)  plotted  in  the  Y-Z  plane 
is  shown  in  the  lower  frames.  (Lower)  We  color  the  points  according  to  time  (Lower  Left)  and  according  to  the  number  of  X  molecules  (Lower  Center).  To 
emphasize  the  strength  of  our  approach,  we  randomize  the  order  of  the  data  points  -  we  color  the  resulting  data  set  according  to  the  order  in  the  new  list 
(Lower  Right). 
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Fig.  6.  Inherently  nonlinear  chemical  reactions.  (Left)  The  dataset  in  the  Y-Z  plane  with  each  point  colored  according  to  u i .  (Center)  Vector  u i  as  a  function 
of  X.  (Right)  Vector  1/1  as  a  function  of  Y. 


2000  data  points  lying  on  this  trajectory  colored  by  time  {Lower 
Left  frame).  In  the  Lower  Center  frame  of  Fig.  5,  we  provide  the 
similar  Y -Z  plot  where  the  data  points  are  colored  according  to 
the  value  ofX. 

The  set  of  2000  data  points  (plotted  in  the  Lower  Right  frame  of 
the  Fig.  5)  is  the  input  of  the  diffusion-map  approach.  To  empha¬ 
size  the  strength  of  our  approach,  the  data  points  are  ordered 
randomly  in  the  inputting  dataset.  In  our  model,  the  slow  vari¬ 
able  X  is  a  nondecreasing  function  of  time  t  (see  Fig.  5  Upper 
Left  frame).  Consequently,  the  dataset  recorded  from  the  stochas¬ 
tic  simulation  is  ordered  according  to  the  slow  variable.  In  more 
complicated  chemical  examples  [e.g.  problems  with  oscillations 
(30)],  or  the  oscillating  half-moons  from  the  previous  example, 
there  is  no  obvious  relation  between  the  “dynamic  proximity” 
of  data  points  and  the  order  in  which  they  are  recorded.  Our 
approach  works  in  more  complicated  situations  because  the  ADM 
is  independent  of  the  order  of  the  inputting  data  points. 

We  use  short  bursts  of  time  At  =  5  x  10-4  (which  correspond  to 
approximately  100  Gillespie  SSA  time  steps)  of  stochastic  sim¬ 
ulations  initialized  at  the  N  =  2000  data  points  from  Fig.  5 
{Lower  Right  frame).  For  every  data  point  X®  =  [X®,Y(l 11\  Z®], 
i  =  1, ...  ,N,  we  run  106  replicas  of  stochastic  simulations  ini¬ 
tialized  at  the  data  point  to  estimate  the  covariance  matrix  5^. 
We  use  e  =  1.  In  the  Right  frame  of  Fig.  6,  we  plot  our  dataset 
[given  in  Fig.  5  {Lower  Right  frame)]  and  we  color  the  data 
points  according  to  the  first  nontrivial  eigenvector  m.  We  see  that 
the  eigenvector  ui  gives  a  good  description  of  slow  dynamics  of 
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the  system  in  Eqs.  24-26.  The  slow  dynamics  can  be  described  by 
the  variable  X,  as  can  be  seen  in  the  Upper  Left  frame  of  Fig.  5.  The 
plot  of  ui  vs.  X  is  shown  in  the  Center  frame  of  Fig.  6.  We  again 
confirm  that  we  obtained  a  good  description  of  the  slow  dynamics 
of  the  system.  Finally,  plotting  the  eigenvector  ui  vs.  Y  confirms 
that  Y  is  not  a  good  slow  variable  {Right  frame  of  Fig.  6). 

Summary 

Finding  a  reduced  model  for  dynamical  systems  with  a  large  num¬ 
ber  of  degrees  of  freedom  is  of  great  importance  in  many  fields. 
Dimensional  reduction  methods  often  use  similarity  measures 
between  different  states  of  the  dynamical  system  to  reveal  its 
low-dimensional  structure.  Those  methods  are  limited  when  the 
similarity  measure  does  not  take  into  account  the  time-labeling 
of  the  states.  We  encode  the  time  dependence  into  an  anisotropic 
similarity  measure  by  using  short  bursts  of  local  simulations.  The 
resulting  leading  eigenvectors  of  the  anisotropic  diffusion  map 
approximate  the  eigenfunctions  of  the  Laplacian  over  the  mani¬ 
fold  corresponding  to  the  dynamically  meaningful  slowly  varying 
coordinates.  We  demonstrated  the  usefulness  of  the  ADM  in  ana¬ 
lyzing  dynamical  systems  by  its  successful  recovery  of  meaningful 
coordinates  in  the  particular  case  of  multiscale  chemical  reactions. 
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